Best-Effort Refresh Strategies for Content-Based RSS Feed Aggregation
Abstract
Over the past several years, RSS-based content syndication has become a standard technique for disseminating information on the web efficiently and in a timely manner. From a data processing perspective, RSS feeds are standard XML resources that feed aggregators refresh periodically to generate continuous streams of items. In this article, we study the problem of information loss in a content-based feed aggregation system and propose a new best-effort refresh strategy for RSS feeds under limited bandwidth. This strategy is evaluated experimentally and compared with other state-of-the-art crawling strategies for web pages.
Similar resources
Optimizing large collections of continuous content-based RSS aggregation queries
In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multi-query optimization. Users create personalized feeds by defining and composing content-based filterin...
RoSeS: A Continuous Content-Based Query Engine for RSS Feeds
In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multiquery optimization. Users create personalized feeds by defining and composing content-based filtering...
Cobra: Content-based Filtering and Aggregation of Blogs and RSS Feeds
Blogs and RSS feeds are becoming increasingly popular. The blogging site LiveJournal has over 11 million user accounts, and according to one report, over 1.6 million postings are made to blogs every day. The “Blogosphere” is a new hotbed of Internet-based media that represents a shift from mostly static content to dynamic, continuously-updated discussions. The problem is that finding and tracki...
Reliability and Timeliness Analysis of Content-based Publish/subscribe Systems
Content-based Publish/subscribe systems (CBPS) is a simple yet powerful communication paradigm. Its content-centric nature is suitable for a wide spectrum of today’s content-centric applications such as stock market quote exchange, remote monitoring and surveillance, RSS news feed, and online gaming. As the trend shows that the amount of information along with its producers become astonishingly...
Automatic Content Syndication in Information Science: A Brazilian Experience in the Creation of RSS Feeds to e-journals
This paper reports the partial results of an exploratory study which intends to develop a methodology for a Web feed-based aggregation content service to electronic journals in Information Science. Ten scientific e-journals were chosen as sample to demonstrate the potential of the Web syndication technology. These e-journals are supported by the Brazilian Electronic Journal Publishing System (S...
Publication date: 2010